Data-driven decisions can be suboptimal when the data are distorted by fraudulent behaviour. Fraud is common in finance and related industries, where large datasets are handled and the motivation for financial gain may be high. Quantitative methods are used to detect and prevent it. Fraud, however, is also committed in other settings, e.g. during clinical trials. This article aims to identify which analytical fraud-detection methods used in finance may be adopted in clinical trials. We systematically reviewed papers published over the last five years and indexed in two databases (Scopus and the Web of Science) in the fields of economics, finance, management and business in general. We considered a broad range of data mining techniques, including artificial intelligence algorithms. As a result, we identified 37 quantitative methods with the potential to be applied in clinical trials. The methods were grouped into three categories: pre-processing techniques, supervised learning and unsupervised learning. Our findings may enhance the future use of fraud-detection methods in clinical trials.
fraud detection, clinical trials, finance, data mining, big data
C00, C38, C55